
destroy request on error to avoid hanging sockets #1141

Open
benzekrimaha wants to merge 5 commits into development/1.3 from improvement/HD-4352-properly-close-sockets

Conversation

@benzekrimaha
Contributor

@benzekrimaha benzekrimaha commented Mar 16, 2026

On timeout or other errors the socket was left open; we now destroy the request so the socket is properly closed.
Issue: HD-4352
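
To illustrate the idea in the description, here is a minimal sketch of a request wrapper that destroys the request on timeout or error so the socket is released. It is not the project's actual code: `getWithCleanup`, its parameters, and the timeout message are hypothetical names chosen for this example. Only `http.request`, `setTimeout`, and `destroy` come from Node's `http` module.

```typescript
import * as http from 'node:http';

// Hypothetical helper (not the code from this PR): issue a request and make
// sure the underlying socket is torn down on timeout or error.
function getWithCleanup(
    options: http.RequestOptions,
    timeoutMs: number,
    callback: (err: Error | null, res?: http.IncomingMessage) => void,
): http.ClientRequest {
    let callbackCalled = false;
    const done = (err: Error | null, res?: http.IncomingMessage): void => {
        // Guard so the caller is notified only once.
        if (!callbackCalled) {
            callbackCalled = true;
            callback(err, res);
        }
    };

    const request = http.request(options, res => done(null, res));

    // The 'timeout' event only reports socket inactivity; the socket stays
    // open until the request is destroyed explicitly.
    request.setTimeout(timeoutMs, () => {
        request.destroy(new Error(`request timed out after ${timeoutMs} ms`));
    });

    request.on('error', err => {
        // Returns early if the request was already destroyed.
        request.destroy(err);
        done(err);
    });

    request.end();
    return request;
}
```

Destroying with an error routes the timeout through the same `'error'` handler as any other failure, so the caller sees a single error path.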

On timeout or other errors the socket was left open; destroy the
request so the socket is properly closed.
@bert-e
Contributor

bert-e commented Mar 16, 2026

Hello benzekrimaha,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
| name | description | privileged | authored |
| --- | --- | --- | --- |
| /after_pull_request | Wait for the given pull request id to be merged before continuing with the current one. | | |
| /bypass_author_approval | Bypass the pull request author's approval | | |
| /bypass_build_status | Bypass the build and test status | | |
| /bypass_commit_size | Bypass the check on the size of the changeset TBA | | |
| /bypass_incompatible_branch | Bypass the check on the source branch prefix | | |
| /bypass_jira_check | Bypass the Jira issue check | | |
| /bypass_peer_approval | Bypass the pull request peers' approval | | |
| /bypass_leader_approval | Bypass the pull request leaders' approval | | |
| /approve | Instruct Bert-E that the author has approved the pull request. | | ✍️ |
| /create_pull_requests | Allow the creation of integration pull requests. | | |
| /create_integration_branches | Allow the creation of integration branches. | | |
| /no_octopus | Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead | | |
| /unanimity | Change review acceptance criteria from one reviewer at least to all reviewers | | |
| /wait | Instruct Bert-E not to run until further notice. | | |
Available commands
| name | description | privileged |
| --- | --- | --- |
| /help | Print Bert-E's manual in the pull request. | |
| /status | Print Bert-E's current status in the pull request TBA | |
| /clear | Remove all comments from Bert-E from the history TBA | |
| /retry | Re-start a fresh build TBA | |
| /build | Re-start a fresh build TBA | |
| /force_reset | Delete integration branches & pull requests, and restart merge process from the beginning. | |
| /reset | Try to remove integration branches unless there are commits on them which do not appear on the source branch. | |

Status report is not available.

@benzekrimaha benzekrimaha changed the base branch from development/1 to development/1.3 March 16, 2026 07:58
@bert-e
Contributor

bert-e commented Mar 16, 2026

Request integration branches

Waiting for integration branch creation to be requested by the user.

To request integration branches, please comment on this pull request with the following command:

/create_integration_branches

Alternatively, the /approve and /create_pull_requests commands will automatically
create the integration branches.

@scality scality deleted a comment from bert-e Mar 16, 2026
@benzekrimaha benzekrimaha marked this pull request as ready for review March 16, 2026 07:59
@benzekrimaha benzekrimaha requested review from a team, SylvainSenechal and maeldonn March 16, 2026 07:59
Issue: HD-4352

@maeldonn maeldonn left a comment


Could you add a test that reproduces the original problem (socket left open on error) and verifies it is now fixed?

@SylvainSenechal

> Could you add a test that reproduces the original problem (socket left open on error) and verifies it is now fixed?

Yeah, if you can add a small test that counts connections or something similar, that would be nice, but it might be hard, I don't know.

}).on('error', (err) => {
    if (!callbackCalled) {
        callbackCalled = true;
        request.destroy(err);

I did a quick check on this, and I think that by the time we reach the error callback of the request, the socket is already destroyed, so maybe we only need to destroy on line 238?

Contributor Author


We keep the request.destroy(err) here on purpose:
- Node can emit an error after the response callback (e.g., a keep-alive timeout or a reset on an already-ended request). In that path the socket is sometimes still in the agent pool; explicitly destroying it guarantees the FD is closed and doesn't leak.
- ClientRequest.destroy() is idempotent; if the socket was already torn down, this is a no-op.
- We have regression tests covering the post-response error and stream-error paths to ensure destroy is called and sockets don't accumulate: 6d878a0
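
The idempotency point can be checked with a small sketch like the one below. `destroyTwice` is a hypothetical helper written for this illustration, not something from the PR. It relies only on `ClientRequest.destroy()` and the `destroyed` flag from Node's `http` module.

```typescript
import * as http from 'node:http';

// Hypothetical helper: destroy a request twice and report the `destroyed`
// flag after each call, to show that the second call changes nothing.
function destroyTwice(request: http.ClientRequest, err: Error): [boolean, boolean] {
    request.destroy(err);
    const afterFirst = request.destroyed;
    // The request is already marked destroyed, so this call returns early.
    request.destroy(err);
    return [afterFirst, request.destroyed];
}
```

The request passed in should have an `'error'` listener attached, since destroying with an error makes the request emit that error.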

@benzekrimaha benzekrimaha force-pushed the improvement/HD-4352-properly-close-sockets branch 2 times, most recently from ce87e45 to 9cc19c9 Compare March 23, 2026 16:16
Stub http.request to emit a synthetic error after response; assert
destroy() is invoked with the expected err.code.

Issue: HD-4352
@benzekrimaha benzekrimaha force-pushed the improvement/HD-4352-properly-close-sockets branch from 9cc19c9 to 7ab390a Compare March 23, 2026 16:16

@maeldonn maeldonn Mar 24, 2026


The tests cover the post-response error cases well, but they all go through GET which never hits the stream branch in _handleRequest. The change from request.end to request.destroy at line 285 only runs when a stream is piped to the request, and nothing in the tests exercises that path.

It would be good to have a test that passes a stream that errors mid-transfer (e.g. through PUT) and checks that request.destroy gets called.
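
A test of that shape could look roughly like the sketch below. It does not use the project's `_handleRequest`. `pipeBodyWithCleanup` is a hypothetical stand-in for the stream branch, and `RecordingRequest` is a hypothetical fake that records `destroy` calls instead of opening a socket.

```typescript
import { Readable, Writable } from 'node:stream';

// Hypothetical stand-in for the stream branch: pipe a body into the request
// and destroy the request if the body stream errors mid-transfer.
function pipeBodyWithCleanup(body: Readable, request: Writable): void {
    body.once('error', err => {
        request.destroy(err);
    });
    body.pipe(request);
}

// Hypothetical fake request: discards written data and records every
// destroy() call so a test can assert on it.
class RecordingRequest extends Writable {
    destroyErrors: (Error | undefined)[] = [];

    _write(_chunk: unknown, _encoding: BufferEncoding, callback: (error?: Error | null) => void): void {
        callback();
    }

    destroy(error?: Error): this {
        this.destroyErrors.push(error);
        return super.destroy(error);
    }
}
```

A test would attach an `'error'` listener to the fake, call `pipeBodyWithCleanup`, make the body emit an error, and then assert that `destroyErrors` holds exactly that error.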

Contributor Author


Done here: 6d878a0


@maeldonn maeldonn left a comment


If @SylvainSenechal's comment is right, we're missing the most important test.

@benzekrimaha benzekrimaha force-pushed the improvement/HD-4352-properly-close-sockets branch from 9502b41 to d80775c Compare March 30, 2026 12:54
@benzekrimaha benzekrimaha requested a review from maeldonn March 30, 2026 12:54
Assert destroy call counts in post-response and stream-error paths to guard against double-destroy regressions.

Issue: HD-4352
@benzekrimaha benzekrimaha force-pushed the improvement/HD-4352-properly-close-sockets branch from 1df5dd9 to 6d878a0 Compare March 30, 2026 13:09

@maeldonn maeldonn left a comment


LGTM. Just two small comments/questions.

const afterSockets = countAgentSockets((h as unknown as { httpAgent: unknown }).httpAgent);
try {
    assert.ok(
        afterSockets <= baselineSockets + 2,

Why +2?

Contributor Author


When the loop runs, the agent can temporarily hold one live socket plus one free socket in its pools (keep-alive reuse or a race between the response and the post-response error). The bound is just to ensure we're not leaking; it tolerates at most one extra active socket and one extra free socket beyond the baseline.
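
For readers without the diff at hand, the bound can be sketched as follows. The PR's real `countAgentSockets` is not shown in this thread, so the version below is an assumption about its shape: it sums the agent's `sockets` and `freeSockets` pools, which are documented `http.Agent` properties. `poolSize` and `withinLeakBound` are hypothetical names.

```typescript
import * as http from 'node:http';

type SocketPool = Readonly<Record<string, readonly unknown[] | undefined>>;

// Sum the number of sockets across all host:port keys of one pool.
function poolSize(pool: SocketPool): number {
    let total = 0;
    for (const sockets of Object.values(pool)) {
        total += sockets?.length ?? 0;
    }
    return total;
}

// Assumed shape of the helper: active sockets plus idle keep-alive sockets.
// Queued requests (agent.requests) are deliberately not counted.
function countAgentSockets(agent: http.Agent): number {
    return poolSize(agent.sockets) + poolSize(agent.freeSockets);
}

// Hypothetical check mirroring the assertion under review: allow one extra
// active socket and one extra free socket beyond the baseline.
function withinLeakBound(baseline: number, after: number, tolerance = 2): boolean {
    return after <= baseline + tolerance;
}
```

A constant tolerance means the check fails once sockets start accumulating across loop iterations, while still allowing for one socket in each pool at the moment of measurement.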

}) as typeof http.request);
}

function stubRequestCaptureDestroy(assignCapture: (cap: DestroyCapture) => void): void {

It seems that there is some duplicated logic between stubRequestCaptureDestroy and stubRequestWithPostResponseError.

Contributor Author


Agreed, they share most of the setup. I kept two helpers because one injects a post-response error and the other only captures destroy, but I can fold them into a single helper with an optional post-response error injection flag to remove the duplication.
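
One possible shape for such a merged helper is sketched below. This is not the code from the follow-up commit. `makeRequestStub`, `StubOptions`, `FakeClientRequest`, and the `destroyErrors` field of `DestroyCapture` are all hypothetical, and the fake emits its error synchronously for simplicity. The returned function would stand in for `http.request` through whatever stubbing mechanism the test suite already uses.

```typescript
import { EventEmitter } from 'node:events';

// Hypothetical capture object: records the argument of every destroy() call.
interface DestroyCapture {
    destroyErrors: (Error | undefined)[];
}

// Hypothetical options: when set, the fake emits this error after the response.
interface StubOptions {
    postResponseError?: Error;
}

type ResponseListener = (res: EventEmitter) => void;

class FakeClientRequest extends EventEmitter {
    private readonly capture: DestroyCapture;
    private readonly options: StubOptions;
    private readonly onResponse?: ResponseListener;

    constructor(capture: DestroyCapture, options: StubOptions, onResponse?: ResponseListener) {
        super();
        this.capture = capture;
        this.options = options;
        this.onResponse = onResponse;
    }

    destroy(err?: Error): this {
        this.capture.destroyErrors.push(err);
        return this;
    }

    end(): this {
        // Deliver a (fake) response first, then optionally inject the error.
        this.onResponse?.(new EventEmitter());
        if (this.options.postResponseError) {
            this.emit('error', this.options.postResponseError);
        }
        return this;
    }
}

// Single factory covering both cases: capture only, or capture plus a
// post-response error.
function makeRequestStub(capture: DestroyCapture, options: StubOptions = {}) {
    return (_requestOptions: unknown, onResponse?: ResponseListener): FakeClientRequest =>
        new FakeClientRequest(capture, options, onResponse);
}
```

With the optional field, the capture-only case is simply the default, so both existing tests can share one setup path.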

Unify destroy-capture stubs with optional post-response error injection to reduce duplication and keep tests clearer.

Issue: HD-4352


4 participants